Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm][Support] fix convertToSnakeFromCamelCase #68375

Merged
merged 5 commits into from
Oct 6, 2023

Conversation

makslevental
Copy link
Contributor

@makslevental makslevental commented Oct 6, 2023

Currently runs of caps aren't handled correctly so e.g. something like Intel_OCL_BI is snake cased to intel_o_c_l_b_i (previously discussed on this phabricator patch).

@llvmbot
Copy link
Member

llvmbot commented Oct 6, 2023

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-llvm-adt

Changes

Currently runs of caps aren't handled correctly so e.g. something like Intel_OCL_BI is snake cased to intel_o_c_l_b_i (previously discussed on this phabricator patch). The easiest way to fix was to use regexes instead of the existing loop. Full disclosure the regexes were pulled from inflection but it should be pretty clear they're correct.


Full diff: https://github.com/llvm/llvm-project/pull/68375.diff

2 Files Affected:

  • (modified) llvm/lib/Support/StringExtras.cpp (+7-11)
  • (modified) llvm/unittests/ADT/StringExtrasTest.cpp (+5)
diff --git a/llvm/lib/Support/StringExtras.cpp b/llvm/lib/Support/StringExtras.cpp
index 5683d7005584eb2..fd5a34fb3d6e82c 100644
--- a/llvm/lib/Support/StringExtras.cpp
+++ b/llvm/lib/Support/StringExtras.cpp
@@ -12,6 +12,7 @@
 
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/Support/Regex.h"
 #include "llvm/Support/raw_ostream.h"
 #include <cctype>
 
@@ -96,18 +97,13 @@ std::string llvm::convertToSnakeFromCamelCase(StringRef input) {
   if (input.empty())
     return "";
 
-  std::string snakeCase;
-  snakeCase.reserve(input.size());
-  for (char c : input) {
-    if (!std::isupper(c)) {
-      snakeCase.push_back(c);
-      continue;
-    }
-
-    if (!snakeCase.empty() && snakeCase.back() != '_')
-      snakeCase.push_back('_');
-    snakeCase.push_back(llvm::toLower(c));
+  std::string snakeCase = input.str();
+  for (int i = 0; i < 10; ++i) {
+    snakeCase = llvm::Regex("([A-Z]+)([A-Z][a-z])").sub("\\1_\\2", snakeCase);
+    snakeCase = llvm::Regex("([a-z0-9])([A-Z])").sub("\\1_\\2", snakeCase);
   }
+  std::transform(snakeCase.begin(), snakeCase.end(), snakeCase.begin(),
+                 [](unsigned char c) { return std::tolower(c); });
   return snakeCase;
 }
 
diff --git a/llvm/unittests/ADT/StringExtrasTest.cpp b/llvm/unittests/ADT/StringExtrasTest.cpp
index 3f69c91b270a355..fab562f1ed0d594 100644
--- a/llvm/unittests/ADT/StringExtrasTest.cpp
+++ b/llvm/unittests/ADT/StringExtrasTest.cpp
@@ -184,6 +184,11 @@ TEST(StringExtrasTest, ConvertToSnakeFromCamelCase) {
 
   testConvertToSnakeCase("OpName", "op_name");
   testConvertToSnakeCase("opName", "op_name");
+  testConvertToSnakeCase("OPName", "op_name");
+  testConvertToSnakeCase("opNAME", "op_name");
+  testConvertToSnakeCase("opNAMe", "op_na_me");
+  testConvertToSnakeCase("opnameE", "opname_e");
+  testConvertToSnakeCase("OPNameOPName", "op_name_op_name");
   testConvertToSnakeCase("_OpName", "_op_name");
   testConvertToSnakeCase("Op_Name", "op_name");
   testConvertToSnakeCase("", "");

@github-actions
Copy link

github-actions bot commented Oct 6, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@makslevental makslevental requested a review from zero9178 October 6, 2023 15:52
llvm/lib/Support/StringExtras.cpp Outdated Show resolved Hide resolved
llvm/lib/Support/StringExtras.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@rkayaith rkayaith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some minor comments

llvm/lib/Support/StringExtras.cpp Outdated Show resolved Hide resolved
llvm/unittests/ADT/StringExtrasTest.cpp Outdated Show resolved Hide resolved
llvm/lib/Support/StringExtras.cpp Outdated Show resolved Hide resolved
Copy link
Member

@zero9178 zero9178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this version a lot better than the regex :)

LGTM % nits

llvm/lib/Support/StringExtras.cpp Show resolved Hide resolved
llvm/lib/Support/StringExtras.cpp Outdated Show resolved Hide resolved
@makslevental
Copy link
Contributor Author

image

@makslevental makslevental merged commit 767dcc5 into llvm:main Oct 6, 2023
@makslevental makslevental deleted the fix_snake_case branch October 6, 2023 21:33
makslevental added a commit that referenced this pull request Oct 9, 2023
This PR adds the additional generation of what I'm calling "value
builders" (a term I'm not married to) that look like this:

```python
def empty(sizes, element_type, *, loc=None, ip=None):
    return get_result_or_results(tensor.EmptyOp(sizes=sizes, element_type=element_type, loc=loc, ip=ip))
```

which instantiates a `tensor.EmptyOp` and then immediately grabs the
result (`OpResult`) and then returns that *instead of a handle to the
op*.

What's the point of adding these when `EmptyOp.result` already exists?
My claim/feeling/intuition is that eDSL users are more comfortable with
a value centric programming model (i.e., passing values as operands) as
opposed to an operator instantiation programming model. Thus this change
enables (or at least goes towards) the bindings supporting such a user
and use case. For example,

```python
i32 = IntegerType.get_signless(32)
...
ten1 = tensor.empty((10, 10), i32)
ten2 = tensor.empty((10, 10), i32)
ten3 = arith.addi(ten1, ten2)
```

Note, in order to present a "pythonic" API and enable "pythonic" eDSLs,
the generated identifiers (op names and operand names) are snake case
instead of camel case and thus `llvm::convertToSnakeFromCamelCase`
needed a small fix. Thus this PR is stacked on top of
#68375.

In addition, as a kind of victory lap, this PR adds a "rangefor" that
looks and acts exactly like python's `range` but emits `scf.for`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants